Goto

Collaborating Authors

 remainder term





Supplementary Material for " Bootstrapping the error of Oja's algorithm "

Neural Information Processing Systems

In this document we provide the detailed proofs of results presented in the main manuscript. We also provide the Hoeffding decomposition for the bootstrap in Proposition A.4. In Section B we provide all results needed for a complete proof of Theorem 1. In Sections B.1, B.2, and B.3 we provide the proof of Theorem 1, the adaptation of high dimensional CLT of [8] to our setting and all supporting lemmas, respectively. In Section C we provide all details of the proof of the Bootstrap consistency, i.e.





Non-Asymptotic Uncertainty Quantification in High-Dimensional Learning

arXiv.org Artificial Intelligence

Uncertainty quantification (UQ) is a crucial but challenging task in many high-dimensional regression or learning problems to increase the confidence of a given predictor. We develop a new data-driven approach for UQ in regression that applies both to classical regression approaches such as the LASSO as well as to neural networks. One of the most notable UQ techniques is the debiased LASSO, which modifies the LASSO to allow for the construction of asymptotic confidence intervals by decomposing the estimation error into a Gaussian and an asymptotically vanishing bias component. However, in real-world problems with finite-dimensional data, the bias term is often too significant to be neglected, resulting in overly narrow confidence intervals. Our work rigorously addresses this issue and derives a data-driven adjustment that corrects the confidence intervals for a large class of predictors by estimating the means and variances of the bias terms from training data, exploiting high-dimensional concentration phenomena. This gives rise to non-asymptotic confidence intervals, which can help avoid overestimating uncertainty in critical applications such as MRI diagnosis. Importantly, our analysis extends beyond sparse regression to data-driven predictors like neural networks, enhancing the reliability of model-based deep learning. Our findings bridge the gap between established theory and the practical applicability of such debiased methods.


Finite-time High-probability Bounds for Polyak-Ruppert Averaged Iterates of Linear Stochastic Approximation

arXiv.org Artificial Intelligence

The LSA algorithm is central in statistics, machine learning, and linear systems identification, see e.g. the works Eweda and Macchi (1983); Widrow and Stearns (1985); Benveniste et al. (2012); Kushner and Yin (2003) and references therein. More recently, it has sparked a renewed interest in machine learning, especially for high-dimensional least squares and reinforcement learning (RL) problems; Bertsekas and Tsitsiklis (2003); Bottou et al. (2018); Sutton (1988); Bertsekas (2019); Watkins and Dayan (1992). The LSA and LSA-PR recursions (1) have been the subject of a wealth of work, and it is difficult to adequately acknowledge all contributions. Polyak and Juditsky (1992); Kushner and Yin (2003); Borkar (2008); Benveniste et al. (2012) provided asymptotic convergence guarantees (almost sure convergence, central limit theorem) under both i.i.d. and Markovian noise settings. In particular, it has been established that LSA-PR can accelerate LSA and satisfies a central limit theorem with an asymptotically minimax-optimal covariance matrix. Although asymptotic convergence analysis is of theoretical interest, the current trend is to obtain nonasymptotic guarantees that take into account both the limited sample size and the dimension of the parameter space. For these reasons, non-asymptotic analysis of both i.i.d. and Markovian SA procedures has recently attracted much attention.


Uncertainty quantification for distributed regression

arXiv.org Machine Learning

The ever-growing size of the datasets renders well-studied learning techniques, such as Kernel Ridge Regression, inapplicable, posing a serious computational challenge. Divide-and-conquer is a common remedy, suggesting to split the dataset into disjoint partitions, obtain the local estimates and average them, it allows to scale-up an otherwise ineffective base approach. In the current study we suggest a fully data-driven approach to quantify uncertainty of the averaged estimator. Namely, we construct simultaneous element-wise confidence bands for the predictions yielded by the averaged estimator on a given deterministic prediction set. The novel approach features rigorous theoretical guaranties for a wide class of base learners with Kernel Ridge regression being a special case. As a by-product of our analysis we also obtain a sup-norm consistency result for the divide-and-conquer Kernel Ridge Regression. The simulation study supports the theoretical findings.